
Google's search engine) which endpoints to index and which to ignore. It hinted that the robots.txt file may have more entries than just these two, and advised us to inspect it manually.

Lastly, it also identified another endpoint at /wp-login.php, which is a login page for WordPress, a well-known blogging platform. Navigate to the main page at http://172.16.10.12/ to confirm you've identified a blog.
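If you'd rather confirm this from the terminal, a quick check with cURL and grep (shown here purely as an illustration) should reveal WordPress markers in the page source:

$ curl -s http://172.16.10.12/ | grep -i wordpress

Any matching lines that reference wp-content or wp-includes are a strong hint that the site runs WordPress.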

Exercise 6: Automatically Exploring Non-Indexed Endpoints

Nikto advised us to manually explore the robots.txt file at http://172.16.10.12/robots.txt to identify non-indexed endpoints. Finding these endpoints is useful during a penetration test because we can add them to our list of possible targets to test. If you open this file, you should notice a list of paths:

User-agent: *
Disallow: /cgi-bin/
Disallow: /z/j/
Disallow: /z/c/
Disallow: /stats/
--snip--
Disallow: /manual
Disallow: /manual/*
Disallow: /phpmanual/
Disallow: /category/
Disallow: /donate.php
Disallow: /amount_to_donate.txt
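You can also pull this file down from the command line; the following cURL commands (an illustration, not one of the book's listings) print the full file and then just the Disallow lines:

$ curl -s http://172.16.10.12/robots.txt
$ curl -s http://172.16.10.12/robots.txt | grep '^Disallow:'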

We identified some of these endpoints earlier (such as /donate.php and /wp-admin), but others we didn't see when scanning with Nikto.

Now that we've found these endpoints, we can use bash to see whether they really exist on the server. Let's put together a script that performs the following activities: make an HTTP request to robots.txt, iterate over each line of the response, parse the output to extract only the paths, make an additional HTTP request to each path separately, and check what status code each path returns to find out whether it exists.
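Before looking at the full listing, here is a minimal sketch of that logic, written under the assumption that the target address is the one used throughout this chapter; it is only an illustration of the approach, not the book's Listing 5-1:

#!/bin/bash
# Rough sketch: fetch robots.txt, extract each disallowed path,
# then request every path and report its HTTP status code.
TARGET="http://172.16.10.12"

curl -s "${TARGET}/robots.txt" | while read -r line; do
  # Keep only the path portion of lines that begin with "Disallow:".
  path=$(echo "${line}" | awk '/^Disallow:/ {print $2}')
  if [[ -n "${path}" ]]; then
    # -o /dev/null discards the body; -w prints only the status code.
    status=$(curl -s -o /dev/null -w "%{http_code}" "${TARGET}${path}")
    echo "${status} ${TARGET}${path}"
  fi
done

A 200 response suggests the path exists, while a 404 means it doesn't; codes such as 301, 302, or 403 are still worth noting because they indicate the endpoint is present but redirected or restricted.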

Listing 5-1 is an example script that can help do this work. It relies on a useful cURL feature you'll find handy in your bash scripts: built-in variables you can use when you need to make HTTP requests, such as the size of the request sent
